Ch.11 Change of Basis

Return to TOC

Theorems

Related theorems for matrices and linear maps

For $A\in\mathcal{M}_{n\times n}$ the following are equivalent:

  1. $A$ has a right inverse in $\mathcal{M}_{n\times n}$
  2. $A$ has a left inverse in $\mathcal{M}_{n\times n}$
  3. $\text{rank}(A)=n$
  4. $\text{nullity}(A)=0$
  5. $A$ is invertible

If $f:V\to W$ is a linear map and $\text{dimension }V=\text{dimension }W=n$, the following are equivalent:

  1. $f$ is onto (i.e. there is a map $g:W\to V$ such that $f\circ g=\text{Id}_W$)
  2. $f$ is one-to-one (i.e. there is a map $g:W\to V$ such that $g\circ f=\text{Id}_V$)
  3. $\text{rank}(f)=n$
  4. $\text{nullity}(f)=0$
  5. $f$ is an isomorphism

Change of Basis

Changing the representation of a vector $\vec{v}$ from one basis to another.
The vector itself is the same, just the representations change. So,

The change of basis matrix for bases $B$ and $D$ of $V$ is the representation of the identity map $\text{id}:V\to V$ with respect to those bases:
$$\text{Rep}_{B,D}(\text{id})=\begin{pmatrix}\vdots&&\vdots\\\text{Rep}_D(\vec{\beta}_1)&\cdots&\text{Rep}_D(\vec{\beta}_n)\\\vdots&&\vdots\end{pmatrix}$$

This has the effect that
$$\text{Rep}_{B,D}(\text{id})\,\text{Rep}_B(\vec{v})=\text{Rep}_D(\vec{v})$$
Conversely, if a matrix $M$ satisfies
$$M\,\text{Rep}_B(\vec{v})=\text{Rep}_D(\vec{v})$$
for all $\vec{v}$, then $M$ is a change of basis matrix.
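
As a minimal numerical sketch (not from the text; it assumes the vectors of $B$ and $D$ are already available as coordinate columns in some common basis, and the helper name is hypothetical), each column $\text{Rep}_D(\vec{\beta}_i)$ can be found by solving a linear system:

```python
import numpy as np

def change_of_basis_matrix(B, D):
    """Rep_{B,D}(id): column i is Rep_D(beta_i), obtained by solving D_mat @ x = beta_i."""
    B_mat = np.column_stack(B)            # columns are the beta_i
    D_mat = np.column_stack(D)            # columns are the delta_i
    return np.linalg.solve(D_mat, B_mat)  # solves D_mat @ X = B_mat column by column
```

With this, `change_of_basis_matrix(B, D) @ rep_B_v` gives $\text{Rep}_D(\vec{v})$, matching the identity above.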

Example 11.1

Find the change of basis matrix for the following bases $B,D$ of $\mathcal{P}_2$:
$$B=\langle1,\,1+x,\,1+x+x^2\rangle \qquad D=\langle x^2-1,\,x,\,x^2+1\rangle$$


Call the matrix $M$. Since this is an identity map,
$$M(1)=\text{Rep}_D(1)=\begin{pmatrix}-1/2\\0\\1/2\end{pmatrix}$$
$$M(1+x)=\text{Rep}_D(1+x)=\begin{pmatrix}-1/2\\1\\1/2\end{pmatrix}$$
$$M(1+x+x^2)=\text{Rep}_D(1+x+x^2)=\begin{pmatrix}0\\1\\1\end{pmatrix}$$
So,
$$\text{Rep}_{B,D}(\text{id})=\begin{pmatrix}-1/2&-1/2&0\\0&1&1\\1/2&1/2&1\end{pmatrix}$$


Now, suppose we want to take $\vec{v}=3+2x+4x^2$:
$$\text{Rep}_B(\vec{v})=\begin{pmatrix}1\\-2\\4\end{pmatrix}$$
Then we can change basis to $D$ by multiplying
$$\text{Rep}_D(\vec{v})=\begin{pmatrix}-1/2&-1/2&0\\0&1&1\\1/2&1/2&1\end{pmatrix}\begin{pmatrix}1\\-2\\4\end{pmatrix}=\begin{pmatrix}1/2\\2\\7/2\end{pmatrix}$$
And indeed,
$$1/2(x^2-1)+2(x)+7/2(x^2+1)=3+2x+4x^2=\vec{v}$$
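
A small check of this example with numpy, writing each polynomial $a+bx+cx^2$ as the coefficient column $(a,b,c)$ (a verification sketch, not part of the original computation):

```python
import numpy as np

# Polynomials a + b*x + c*x^2 as coefficient columns (a, b, c).
B = np.array([[1, 0, 0], [1, 1, 0], [1, 1, 1]], dtype=float).T   # columns: 1, 1+x, 1+x+x^2
D = np.array([[-1, 0, 1], [0, 1, 0], [1, 0, 1]], dtype=float).T  # columns: x^2-1, x, x^2+1

M = np.linalg.solve(D, B)                     # Rep_{B,D}(id)
print(M)                                      # matches the matrix computed above

rep_B_v = np.array([1, -2, 4], dtype=float)   # Rep_B(3 + 2x + 4x^2)
rep_D_v = M @ rep_B_v
print(rep_D_v)                                # [0.5 2.  3.5]
print(D @ rep_D_v)                            # [3. 2. 4.]  i.e. 3 + 2x + 4x^2, the original vector
```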

For a matrix $M$,
$M$ changes basis $\iff$ $M$ is nonsingular

Proof

For the forward direction, we must prove $M$ changes basis $\implies$ $M$ is nonsingular.
Since changing from basis $B$ to $D$ can be undone (by changing from $D$ back to $B$), $M$ must be invertible, and therefore nonsingular.
For the reverse direction, we must prove $M$ is nonsingular $\implies$ $M$ is a change of basis matrix.
Since $M$ is nonsingular, it is a product of elementary reduction matrices (see Ch.10 for proof), so we only need to show that each elementary reduction matrix changes basis.
First, the matrix which multiplies row $i$ by a constant $k\neq0$ changes basis from $\langle\vec{\beta}_1,...,\vec{\beta}_i,...,\vec{\beta}_n\rangle$ to $\langle\vec{\beta}_1,...,\frac{1}{k}\vec{\beta}_i,...,\vec{\beta}_n\rangle$:
$$\begin{array}{rcccl}\vec{v}&=&c_1\vec{\beta}_1+\cdots+c_i\vec{\beta}_i+\cdots+c_n\vec{\beta}_n&&\\&=&c_1\vec{\beta}_1+\cdots+kc_i(\tfrac{1}{k}\vec{\beta}_i)+\cdots+c_n\vec{\beta}_n&=&\vec{v}\end{array}$$

For the matrix that swaps two rows, the corresponding basis vectors also simply swap.

For the matrix that adds a multiple of one row to another (i.e. $kc_i+c_j$), the basis changes from $\langle\vec{\beta}_1,...,\vec{\beta}_i,...,\vec{\beta}_j,...,\vec{\beta}_n\rangle$ to $\langle\vec{\beta}_1,...,\vec{\beta}_i-k\vec{\beta}_j,...,\vec{\beta}_j,...,\vec{\beta}_n\rangle$:
$$\begin{array}{rcccl}\vec{v}&=&c_1\vec{\beta}_1+\cdots+c_i\vec{\beta}_i+\cdots+c_j\vec{\beta}_j+\cdots+c_n\vec{\beta}_n&&\\&=&c_1\vec{\beta}_1+\cdots+c_i(\vec{\beta}_i-k\vec{\beta}_j)+\cdots+(kc_i+c_j)\vec{\beta}_j+\cdots+c_n\vec{\beta}_n&=&\vec{v}\end{array}$$
(notice that the $kc_i\vec{\beta}_j$ terms cancel out)
Thus each elementary reduction matrix changes basis, so any nonsingular matrix, being a product of them, is a change of basis matrix.

Since a change of basis matrix is the representation of the identity map with respect to two bases, we also have:
$M$ is nonsingular $\iff$ $M$ represents the identity map with respect to some pair of bases
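
A quick numerical spot-check of the row-combination case from the proof (the basis and coefficients below are hypothetical choices):

```python
import numpy as np

B = np.array([[1., 2., 0.],
              [0., 1., 1.],
              [1., 0., 1.]])          # columns beta_1, beta_2, beta_3 (a nonsingular choice)
c = np.array([2., -1., 3.])           # Rep_B(v)
v = B @ c

k, i, j = 5., 0, 2                    # the operation "add k * (row i) to row j"
E = np.eye(3)
E[j, i] = k                           # the elementary reduction matrix

B_new = B.copy()
B_new[:, i] = B[:, i] - k * B[:, j]   # replace beta_i with beta_i - k*beta_j

# E @ c should represent the same v with respect to the modified basis.
print(np.allclose(B_new @ (E @ c), v))   # True
```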


Changing map representations


The next thing to consider is changing the bases of a map, from $\text{Rep}_{B,D}(h)$ to $\text{Rep}_{\hat{B},\hat{D}}(h)$.

There are two ways of taking a vector with respect to $\hat{B}$ and mapping it to a vector with respect to $\hat{D}$:
1. $\text{Rep}_{\hat{B}}\xrightarrow{\text{id}}\text{Rep}_B\xrightarrow{H}\text{Rep}_D\xrightarrow{\text{id}}\text{Rep}_{\hat{D}}$
2. $\text{Rep}_{\hat{B}}\xrightarrow{\hat{H}}\text{Rep}_{\hat{D}}$

From this, we can see:
$$\hat{H}=\text{Rep}_{D,\hat{D}}(\text{id})\cdot H\cdot\text{Rep}_{\hat{B},B}(\text{id})$$
In other words, to find the matrix representation of the map $h:V_{\text{wrt }\hat{B}}\to W_{\text{wrt }\hat{D}}$, multiply the change of basis matrix from $\hat{B}$ to $B$ with the matrix representation of $h:V_{\text{wrt }B}\to W_{\text{wrt }D}$, and then with the change of basis matrix from $D$ to $\hat{D}$.

Example 11.2

Consider the transformation $h:\mathbb{R}^3\to\mathbb{R}^3$ represented by
$$\text{Rep}_{\mathcal{E}_3,\mathcal{E}_3}(h)=H=\begin{pmatrix}1&0&2\\0&1&1\\1&2&3\end{pmatrix}$$
For example, this transformation takes the vector $\begin{pmatrix}1\\1\\1\end{pmatrix}$ to the vector $\begin{pmatrix}3\\2\\6\end{pmatrix}$
(note: both are with respect to the standard basis)
$$\begin{pmatrix}1\\1\\1\end{pmatrix}\xmapsto{h}\begin{pmatrix}3\\2\\6\end{pmatrix}$$

Suppose we want to convert to the basis
$$\hat{B}=\hat{D}=\left\langle\begin{pmatrix}1\\0\\0\end{pmatrix},\begin{pmatrix}1\\1\\0\end{pmatrix},\begin{pmatrix}1\\1\\1\end{pmatrix}\right\rangle$$
Under this basis, $\begin{pmatrix}1\\1\\1\end{pmatrix}_{\mathcal{E}_3}=\begin{pmatrix}0\\0\\1\end{pmatrix}_{\hat{B}}$ and $\begin{pmatrix}3\\2\\6\end{pmatrix}_{\mathcal{E}_3}=\begin{pmatrix}1\\-4\\6\end{pmatrix}_{\hat{B}}$, so
$$\begin{pmatrix}0\\0\\1\end{pmatrix}_{\hat{B}}\xmapsto{h}\begin{pmatrix}1\\-4\\6\end{pmatrix}_{\hat{B}}$$
This is the same map, but under a different representation. We want to find the matrix representation of this map, $\hat{H}$.


First find $\text{Rep}_{\hat{B},\mathcal{E}_3}(\text{id})$:
$$\begin{pmatrix}1\\0\\0\end{pmatrix}_{\hat{B}}=\begin{pmatrix}1\\0\\0\end{pmatrix}_{\mathcal{E}_3},\quad\begin{pmatrix}0\\1\\0\end{pmatrix}_{\hat{B}}=\begin{pmatrix}1\\1\\0\end{pmatrix}_{\mathcal{E}_3},\quad\begin{pmatrix}0\\0\\1\end{pmatrix}_{\hat{B}}=\begin{pmatrix}1\\1\\1\end{pmatrix}_{\mathcal{E}_3}$$
$$\implies\text{Rep}_{\hat{B},\mathcal{E}_3}(\text{id})=\begin{pmatrix}1&1&1\\0&1&1\\0&0&1\end{pmatrix}$$

Next we find $\text{Rep}_{\mathcal{E}_3,\hat{B}}(\text{id})$:
$$\begin{pmatrix}1\\0\\0\end{pmatrix}_{\mathcal{E}_3}=\begin{pmatrix}1\\0\\0\end{pmatrix}_{\hat{B}},\quad\begin{pmatrix}0\\1\\0\end{pmatrix}_{\mathcal{E}_3}=\begin{pmatrix}-1\\1\\0\end{pmatrix}_{\hat{B}},\quad\begin{pmatrix}0\\0\\1\end{pmatrix}_{\mathcal{E}_3}=\begin{pmatrix}0\\-1\\1\end{pmatrix}_{\hat{B}}$$
$$\implies\text{Rep}_{\mathcal{E}_3,\hat{B}}(\text{id})=\begin{pmatrix}1&-1&0\\0&1&-1\\0&0&1\end{pmatrix}$$
You can also use the fact that
$$\text{Rep}_{\mathcal{E}_3,\hat{B}}(\text{id})=(\text{Rep}_{\hat{B},\mathcal{E}_3}(\text{id}))^{-1}$$
since the inverse of $\text{id}$ is $\text{id}$.

Finally, we have
$$\hat{H}=\begin{pmatrix}1&-1&0\\0&1&-1\\0&0&1\end{pmatrix}\cdot\begin{pmatrix}1&0&2\\0&1&1\\1&2&3\end{pmatrix}\cdot\begin{pmatrix}1&1&1\\0&1&1\\0&0&1\end{pmatrix}=\begin{pmatrix}1&0&1\\-1&-2&-4\\1&3&6\end{pmatrix}$$
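
A short numpy verification of this example (a sketch only; it uses the inverse relation noted above):

```python
import numpy as np

H = np.array([[1., 0., 2.],
              [0., 1., 1.],
              [1., 2., 3.]])              # Rep_{E3,E3}(h)
Bhat = np.array([[1., 1., 1.],
                 [0., 1., 1.],
                 [0., 0., 1.]])           # Rep_{Bhat,E3}(id): columns are the new basis vectors

H_hat = np.linalg.inv(Bhat) @ H @ Bhat    # Rep_{E3,Bhat}(id) is the inverse of Rep_{Bhat,E3}(id)
print(H_hat)                              # [[ 1.  0.  1.], [-1. -2. -4.], [ 1.  3.  6.]]
print(H_hat @ np.array([0., 0., 1.]))     # [ 1. -4.  6.], i.e. (0,0,1)_Bhat maps to (1,-4,6)_Bhat
```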

Same-sized matrices $H$ and $\hat{H}$ are matrix equivalent if there exist nonsingular matrices $P$ and $Q$ such that $\hat{H}=PHQ$.
In other words, matrix equivalent matrices represent the same map under different bases; it is an equivalence relation

Naturally, we want to find the canonical form for the equivalence classes.
Any $m\times n$ matrix with rank $k$ is equivalent to the $m\times n$ matrix where every entry is zero except the first $k$ diagonal entries, which are one:
$$\begin{pmatrix}1&0&\cdots&0&0&\cdots&0\\0&1&\cdots&0&0&\cdots&0\\\vdots&\vdots&&\vdots&\vdots&&\vdots\\0&0&\cdots&1&0&\cdots&0\\0&0&\cdots&0&0&\cdots&0\\\vdots&\vdots&&\vdots&\vdots&&\vdots\\0&0&\cdots&0&0&\cdots&0\end{pmatrix}$$

It can be represented by the block partial-identity form
$$\left(\begin{array}{c|c}I&Z\\\hline Z&Z\end{array}\right)=\left(\begin{array}{cccc|ccc}1&0&\cdots&0&0&\cdots&0\\0&1&\cdots&0&0&\cdots&0\\\vdots&\vdots&&\vdots&\vdots&&\vdots\\0&0&\cdots&1&0&\cdots&0\\\hline0&0&\cdots&0&0&\cdots&0\\\vdots&\vdots&&\vdots&\vdots&&\vdots\\0&0&\cdots&0&0&\cdots&0\end{array}\right)$$
with $I$ being the identity matrix and $Z$ being the zero matrix.
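
For reference, a tiny numpy helper (hypothetical name) that builds this canonical form for given $m$, $n$, $k$:

```python
import numpy as np

def block_partial_identity(m, n, k):
    """m x n matrix with ones in the first k diagonal entries and zeros elsewhere."""
    C = np.zeros((m, n))
    C[:k, :k] = np.eye(k)
    return C

print(block_partial_identity(3, 4, 2))
# [[1. 0. 0. 0.]
#  [0. 1. 0. 0.]
#  [0. 0. 0. 0.]]
```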

Two proofs below:

Proof 1

Theorem: For a linear map $h:V\to W$ with rank $k$, there exist bases $B$ and $D$ such that $\text{Rep}_{B,D}(h)=\left(\begin{array}{c|c}I_k&Z\\\hline Z&Z\end{array}\right)$


Let $V$ have dimension $n$ and $W$ have dimension $m$.
$\implies\text{nullity}(h)=n-k$
$\implies$ we can find $n-k$ vectors $\vec{\beta}_{k+1},...,\vec{\beta}_n$ that form a basis for $\mathscr{N}(h)\subset V$
$\implies$ those vectors can be extended to a basis $B=\langle\vec{\beta}_1,...,\vec{\beta}_k,\vec{\beta}_{k+1},...,\vec{\beta}_n\rangle$ of $V$
$\implies\mathscr{R}(h)=\text{span}\{h(\vec{\beta}_1),...,h(\vec{\beta}_k),h(\vec{\beta}_{k+1}),...,h(\vec{\beta}_n)\}\subset W$
$\implies\mathscr{R}(h)=\text{span}\{h(\vec{\beta}_1),...,h(\vec{\beta}_k)\}$ since $h(\vec{\beta}_i)=\vec{0}$ for $i=k+1,...,n$
$\implies\vec{\delta}_1=h(\vec{\beta}_1),...,\vec{\delta}_k=h(\vec{\beta}_k)$ form a basis for $\mathscr{R}(h)$ (they span it, and since $\text{rank}(h)=k$ a spanning set of $k$ vectors must be linearly independent)
$\implies$ they can be extended to a basis $D=\langle\vec{\delta}_1,...,\vec{\delta}_k,\vec{\delta}_{k+1},...,\vec{\delta}_m\rangle$ of $W$
Thus, we have $h(\vec{\beta}_i)=\vec{\delta}_i$ for $i=1,...,k$ and $h(\vec{\beta}_i)=\vec{0}$ for $i=k+1,...,n$, so we have found $B$ and $D$ such that
$$\text{Rep}_{B,D}(h)=\left(\begin{array}{c|c}I_k&Z\\\hline Z&Z\end{array}\right)$$
as desired.
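
The proof is constructive, and the construction can be carried out directly. Below is a sketch with sympy for a hypothetical rank-$2$ map on $\mathbb{R}^3$ (given by a matrix with respect to the standard bases); it builds $B$ from a null space basis, builds $D$ from the images $\vec{\delta}_i=h(\vec{\beta}_i)$, and checks that $\text{Rep}_{B,D}(h)$ comes out in block partial-identity form:

```python
import sympy as sp

# A hypothetical rank-2 map h : R^3 -> R^3, written with respect to the standard bases.
H = sp.Matrix([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
m, n = H.shape
k = H.rank()

# Step 1: a basis for the null space; these play the role of beta_{k+1}, ..., beta_n.
null_part = H.nullspace()

# Step 2: extend to a basis B of the domain by adding standard basis vectors in front.
front = []
for jdx in range(n):
    if len(front) == k:
        break
    e = sp.eye(n).col(jdx)
    trial = front + [e] + null_part
    if sp.Matrix.hstack(*trial).rank() == len(trial):   # still linearly independent?
        front.append(e)
B = front + null_part

# Step 3: delta_i = h(beta_i) for i <= k span the range; extend to a basis D of the codomain.
D = [H * b for b in B[:k]]
for jdx in range(m):
    if len(D) == m:
        break
    e = sp.eye(m).col(jdx)
    if sp.Matrix.hstack(*(D + [e])).rank() == len(D) + 1:
        D.append(e)

# Rep_{B,D}(h) = D_mat^(-1) * H * B_mat should be the block partial-identity of rank k.
B_mat, D_mat = sp.Matrix.hstack(*B), sp.Matrix.hstack(*D)
print(D_mat.inv() * H * B_mat)   # Matrix([[1, 0, 0], [0, 1, 0], [0, 0, 0]])
```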


Theorem: any $m\times n$ matrix is matrix equivalent to an $m\times n$ matrix of the form $\left(\begin{array}{c|c}I&Z\\\hline Z&Z\end{array}\right)$


Suppose $M$ is an $m\times n$ matrix. For a linear map $h:V\to W$ and appropriate bases $B_1$ of $V$ and $D_1$ of $W$, we have $M=\text{Rep}_{B_1,D_1}(h)$.
From the previous theorem, we can find bases $B$ of $V$ and $D$ of $W$ such that $\text{Rep}_{B,D}(h)=\left(\begin{array}{c|c}I_k&Z\\\hline Z&Z\end{array}\right)$.
These represent the same linear map under different bases, so they are matrix equivalent.

Proof 2

First, recall that elementary operation matrices act on the rows when multiplied on the left and on the columns when multiplied on the right.

For a matrix $M$, we can apply row reduction to get a (not necessarily reduced) echelon form matrix $R$. Combine those reduction matrices (multiplying right to left) into a single matrix $P$. Thus, we have $PM=R$.
Then column-reduce $R$ to get a matrix in block partial-identity form. Combine those operations (multiplying left to right) into a single matrix $Q$. So, we have $PMQ=RQ=\left(\begin{array}{c|c}I&Z\\\hline Z&Z\end{array}\right)$

Example 11.3

Consider this rank $2$ matrix
$$M=\begin{pmatrix}1&2&3\\4&5&6\\7&8&9\end{pmatrix}$$
We will find $P$ and $Q$ such that $PMQ$ is the canonical matrix of rank $2$.
Row reducing gives
$$\begin{pmatrix}1&2&3\\4&5&6\\7&8&9\end{pmatrix}\xrightarrow[-7\rho_1+\rho_3]{-4\rho_1+\rho_2}\;\xrightarrow{-2\rho_2+\rho_3}\;\xrightarrow{-1/3\rho_2}\begin{pmatrix}1&2&3\\0&1&2\\0&0&0\end{pmatrix}$$
So we have (note right to left)
$$P=\begin{pmatrix}1&0&0\\0&-1/3&0\\0&0&1\end{pmatrix}\begin{pmatrix}1&0&0\\0&1&0\\0&-2&1\end{pmatrix}\begin{pmatrix}1&0&0\\-4&1&0\\-7&0&1\end{pmatrix}=\begin{pmatrix}1&0&0\\4/3&-1/3&0\\1&-2&1\end{pmatrix}$$
Next we perform column operations
$$\begin{pmatrix}1&2&3\\0&1&2\\0&0&0\end{pmatrix}\xrightarrow{-2\text{col}_2+\text{col}_3}\;\xrightarrow[\text{col}_1+\text{col}_3]{-2\text{col}_1+\text{col}_2}\begin{pmatrix}1&0&0\\0&1&0\\0&0&0\end{pmatrix}$$
So (note left to right)
$$Q=\begin{pmatrix}1&0&0\\0&1&-2\\0&0&1\end{pmatrix}\begin{pmatrix}1&-2&1\\0&1&0\\0&0&1\end{pmatrix}=\begin{pmatrix}1&-2&1\\0&1&-2\\0&0&1\end{pmatrix}$$
In sum, we have $M$ being matrix equivalent to
$$PMQ=\begin{pmatrix}1&0&0\\4/3&-1/3&0\\1&-2&1\end{pmatrix}\begin{pmatrix}1&2&3\\4&5&6\\7&8&9\end{pmatrix}\begin{pmatrix}1&-2&1\\0&1&-2\\0&0&1\end{pmatrix}=\begin{pmatrix}1&0&0\\0&1&0\\0&0&0\end{pmatrix}$$
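
A quick exact-arithmetic check of this computation with sympy (verification only):

```python
import sympy as sp

M = sp.Matrix([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
P = sp.Matrix([[1, 0, 0], [sp.Rational(4, 3), sp.Rational(-1, 3), 0], [1, -2, 1]])
Q = sp.Matrix([[1, -2, 1], [0, 1, -2], [0, 0, 1]])

print(P * M * Q)   # Matrix([[1, 0, 0], [0, 1, 0], [0, 0, 0]])
print(M.rank())    # 2, matching the number of ones on the diagonal
```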

The effect of a block partial-identity matrix is easy to understand: for example,
$$\begin{pmatrix}1&0&0\\0&1&0\\0&0&0\end{pmatrix}\begin{pmatrix}x\\y\\z\end{pmatrix}=\begin{pmatrix}x\\y\\0\end{pmatrix}$$
is a projection.

Matrix equivalence classes are characterized by the rank:
two same-sized matrices are matrix equivalent iff they have the same rank
Proof: both are equivalent to the same block partial-identity matrix.

In particular, an $n\times n$ matrix is equivalent to $I_n$ iff it is invertible.
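
Since rank is the complete invariant here, checking matrix equivalence reduces to comparing ranks. A small sketch with hypothetical example matrices:

```python
import numpy as np

A = np.array([[1., 2., 3.], [4., 5., 6.], [7., 8., 9.]])   # rank 2
B = np.array([[1., 0., 0.], [0., 1., 0.], [0., 0., 0.]])   # the rank-2 canonical form

# Same size and same rank, so A and B are matrix equivalent.
print(np.linalg.matrix_rank(A) == np.linalg.matrix_rank(B))   # True

# An n x n matrix is equivalent to I_n exactly when its rank is n, i.e. it is invertible.
C = np.array([[2., 1.], [1., 1.]])
print(np.linalg.matrix_rank(C) == C.shape[0])                 # True, so C is equivalent to I_2
```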